The Examiner rejected claims 1-4 and 6-14 under 35 U.S.C. § 103(a) as being 
unpatentable over U.S. Patent No. 6,081,780 to Lumelsky (Lumelsky) in view of European 
Patent Application EP 1,271,469 to Marasek et al. (Marasek) in view of U.S. Patent No. 
5,796,916 to Meredith in view of International Application Publication No. WO 02/097590 to 
Cameron. 

We start by addressing the Examiner's rejection of claim 9. The Examiner admits that 
Lumelsky does not disclose two important elements, namely: (1) the inclusion of a speech 
recognition engine for spoken input; and (2) implementing the system on a handheld device. To 
supply the inclusion of a speech recognition engine for spoken input, the Examiner relies on 
Marasek and to supply the idea of implementing the system on a handheld device, the Examiner 
relies on Cameron. But in both cases, we submit that a person of ordinary skill in the art would 
not combine the disclosed technologies in the way proposed by the Examiner because those 
combinations are inappropriate. Furthermore, the combination of Lumelsky's user station with 
Cameron's handheld device that the Examiner proposes fails to produce what is claimed. We 
explain these points in greater detail below starting with a discussion about what Lumelsky 
discloses. 

Lumelsky discloses a system for authoring speech content that is stored in a data 
repository and then made available for play back upon request by subscribers at remotely located 
terminals. His "singlecast interactive radio system . . . delivers digitized audio-based content to 
subscribers upon their request." (7:3-5) His personal radio station server "stores multiple 
subscribers' profiles with topics of individual interest, assembles content material from various 
Web sites according to the topics, and transmits the content to a subscriber's user terminal ... on 
the user's request. . ." (7:16-19) Lumelsky's authoring system provides voice content to 
subscribers whenever they request it, rather like an on-demand radio station. 

The Combination of Lumelsky with Marasek is Improper 

We now discuss the Examiner's combination of Lumelsky's authoring system with the 
speech recognition function of Marasek. In the Office action dated July 18, 2007, the Examiner 
argued that "the motivation to have combined the references allows the extraction of contextual 
features as well as speaker identification." We disagree. The extraction of contextual features 
discussed by Marasek serves only to assist in speech recognition: ". . .a process of speaker 
identification and/or adaptation can be performed in particular so as to increase the matching rate 
of the feature extraction and/or of the recognition rate of the process of speech recognition." (¶ 
[0015]) But Lumelsky's system has no need for speech recognition because a full text version 
of the speech that is read by the narrators is supplied (Figure 1:113 and Figure 2:121). In 
addition, Lumelsky has no need for speaker identification because the identities of his narrators 
are known: "It is to be understood that the narrator may be a person employed by an 
information/news service provider who is reading a textual representation of the particular data, 
e.g., information or news. . ." (8:56-59) To sum up, neither the extraction of contextual features 
nor speaker identification serves any purpose in Lumelsky's system, and thus a person of skill in 
the art would have no reason to combine Lumelsky's system with Marasek's speech recognition. 

In the advisory action dated November 6, 2007, the Examiner argues that ". . .although a 
text version [of the narrator's speech] is supplied [in Lumelsky], this is merely used to perform 
speech output based on prosody parameters extracted from the narrator." To the contrary, 
Lumelsky's Figure 2A shows that the speech input is used to generate prosody parameters, and 
not the text input. Indeed nowhere in Figure 2A, nor anywhere else in Lumelsky, is there even a 
hint of speech recognition or the need for it. 

In the advisory action, the Examiner goes on to assert that "the use of the secondary 
reference Marasek, suggests that prosody and speech recognition can be performed on the input." 
We interpret this assertion as arguing the following: because Marasek performs speech 
recognition and also extracts prosodic features from his speech input, it follows that Lumelsky 
would also benefit from performing speech recognition in addition to prosody feature extraction. 
But, unlike Lumelsky, Marasek just has speech input - he has no text input. Thus even if 
Marasek employs speech recognition, this does not mean that Lumelsky has any need for, or 
ability to use, speech recognition because Lumelsky already has a text version of the speech 
input, as we discuss above. Or, put another way, the text version of the narrator's speech 
completely obviates the need for a speech recognizer in Lumelsky because that text is precisely 
what a speech recognizer would generate. 

To further support his argument for combining Lumelsky with Marasek, the Examiner 
points out that in Marasek "the use of speech recognition allows semantic relationships and 
statistical information for speech elements to be obtained (see [0040]). The use of speech 
recognition allows the enhancement of quality of personality description (see [0031])." But there 
is no hint anywhere within Lumelsky that such features can serve any useful purpose in 
Lumelsky's system. Rather, Lumelsky's authoring system uses both the text input and the 
narrator's voice input to produce composite encoded speech which is stored in the data 
repository. The fact that speech recognition is used in Marasek's system in which no text 
version is available provides no reason for a person of skill in the art to combine Lumelsky with 
such a system. That Marasek's speech recognition system can also obtain other information 
about speech input does not alter this conclusion because it has no effect on Lumelsky's need 
(or lack thereof) for speech recognition. 

The Combination of Lumelsky with Cameron is Improper 

We now discuss the Examiner's combination of Lumelsky's system with the handheld 
device of Cameron. Cameron discloses speech synthesis implemented on a portable device, such 
as a PDA or a portable telephone. In the Office action dated July 18, 2007, the Examiner 
argues: "It would have been obvious ... to have combined the speech synthesis for an utterance 
as presented by Lumelsky and Meredith by the implementation on a handheld device. The 
motivation to have combined the references in (sic) involves the compression of data from 
spoken information for direct retrieval as well as other tasks are able to be performed." The 
Examiner does not specify whether he is proposing to implement Lumelsky's authoring system 
(Figure 1:101), Lumelsky's user terminal (Figure 1:301), or both the authoring system and the 
user terminal on Cameron's handheld device. But no matter which of Lumelsky's components is 
being referred to, the Examiner's motivation to combine is erroneous because it provides no 
valid reason for a person of skill in the art to implement any of Lumelsky's components on a 
handheld device. 

We now consider combining each of Lumelsky's components in turn with Cameron. 
First, Lumelsky's authoring system (Figure 1:101, Figure 2A:101) already includes components 
that perform speech compression by combining the text version of the input speech (Figure 
1:113) with prosody parameters (Figure 1:116) extracted from the narrator's voice (Figure 
1:114). Combining Lumelsky's authoring system with Cameron does not in any way help a 
person using Lumelsky's system to compress data from spoken information for direct retrieval 
because Lumelsky already has this functionality. Furthermore, the Examiner has not pointed to 
anything in Cameron that would improve Lumelsky's compression of data from spoken 
information for direct retrieval. 

Second, there is no benefit to be gained by combining Lumelsky's user terminal (Figure 
1:301) with Cameron's handheld system for "compression of data from spoken information for 
direct retrieval" because Lumelsky's user terminal receives compressed speech and decrypts and 
decompresses it for playback to the user. Thus Lumelsky's user terminal has no use for data or 
speech compression, and a person of skill in the art would have no reason to turn to Cameron to 
add such functionality to the user terminal. 

We now consider simultaneously implementing both Lumelsky's authoring system and 
Lumelsky's user terminal in Cameron's handheld system. In this combination, the authoring 
system, which receives the speech input, and the user station, which produces speech output, are 
in the same location. But receiving speech and text input and producing corresponding output in 
the same location is completely contrary to the purpose and function of Lumelsky. Lumelsky 
discloses a personal radio station in which the content is authored in one location and at one 
time, and, on demand, the user listens to the speech at another location and at another time. This 
is made clear, for example, in Figure 1, which is a block diagram of Lumelsky's radio station, 
showing the authoring system (101) which is in communication via the Internet with a remote 
user station (301) via a personal radio station server (201) and data repository (401). Lumelsky's 
system could not function as a personal radio station if both the authoring system and the user 
station were implemented in Cameron's handheld device. Therefore this combination is 
improper. 

The Combination of Lumelsky's User Station with Cameron's Handheld Device Fails to Include 
what is Required by the Claim 

In the advisory action, the Examiner argues that Lumelsky's "user terminal for which the 
radio is integrated is mobile and can be placed on cellular phone equipment. It is known that a 
mobile device can be a handheld device as denoted by the definition of mobile device." 
[emphasis added] But Lumelsky's user terminal, even if implemented on a mobile device, falls 
far short of what is claimed. In his rejection, the Examiner relies on Lumelsky to disclose the 
following elements of claim 9: 
an audio input device that receives a spoken utterance; 

a signal processor that determines one or more prosodic parameters of the spoken 
utterance; 

a speech synthesizer that synthesizes a nominal word from the recognized word; and 

a prosodic mimic generator that receives the synthesized nominal word and the one or 
more prosodic parameters and generates a prosodic mimic word therefrom. ... 

But Lumelsky's user station (Figure 1, ref. 301) lacks all four of these claim elements. First, 
Lumelsky's user station lacks an audio input device because, rather like a radio, it only produces 
audio output. Second, Lumelsky's user station has no signal processor that determines one or 
more prosodic parameters of the spoken utterance. Indeed, it has no access to the spoken 
utterance, and therefore cannot process it to determine prosodic parameters. Third, the user 
station in Lumelsky has no speech synthesizer that synthesizes a nominal word from a 
recognized word. Instead, it produces its speech output by means of a decryptor for the received 
data (Figure 1:310) and a decompression engine (Figure 1:314). Fourth, Lumelsky's user station 
has no prosodic mimic generator. Thus, contrary to what the Examiner argues, implementing 
Lumelsky's user station on a mobile device, such as on a cell phone, would not disclose what is 
recited by the claim. 

In view of the above, Applicants believe that claim 9 and claim 1, which contains 
limitations that are comparable to those of claim 9, are allowable. 